
 Spokane County


Generative AI hype distracts us from AI's more important breakthroughs

MIT Technology Review

It's a seductive distraction from the advances in AI that are most likely to improve or even save your life.

On April 28, 2022, at a highly anticipated concert in Spokane, Washington, the musician Paul McCartney astonished his audience with a groundbreaking application of AI: He began to perform with a lifelike depiction of his long-deceased musical partner, John Lennon. Using recent advances in audio and video processing, engineers had taken the pair's final performance (London, 1969), separated Lennon's voice and image from the original mix and restored them with lifelike clarity. For years, researchers like me had taught machines to "see" and "hear" in order to make such a moment possible. As McCartney and Lennon appeared to reunite across time and space, the arena fell silent; many in the crowd began to cry. As an AI scientist and lifelong Beatles fan, I felt profound gratitude that we could experience this truly life-changing moment. Later that year, the world was captivated by another major breakthrough: AI conversation.



Why Former NFL All-Pros Are Turning to Psychedelics

WIRED

Research into whether drugs like ayahuasca can mitigate the effects of traumatic brain injury is in its infancy. Pro athletes like the Buffalo Bills' Jordan Poyer are forging ahead anyway. Roam the wide-open halls and cavernous showrooms of the Colorado Convention Center during Psychedelic Science, the world's largest psychedelics conference, and you'll see exhibitors hawking everything from mushroom jewelry, to chewable gummies containing extracts of the psychoactive succulent plant kanna, to broad flat-brim baseball caps emblazoned with "MDMA" and "IBOGA." Booths publicize organizations such as the Ketamine Taskforce and the Psychedelic Parenthood Community, and even, a live-action feature film looking to attract investors. It's a motley, multifarious symposium where indigenous-plant-medicine healers mingle with lanyard-clad pharma-bros, legendary underground LSD chemists, and workaday stoners tottering around in massive red and white toadstool hats that make them look like that cute little mushroom guy from . And yet, oddest among such oddities may be the sight of enormously burly NFL tough guys talking candidly about their feelings.


Amazon's newest fulfillment robot has a sense of touch

Engadget

Amazon has deployed over 750,000 robots to its fulfillment centers over the last decade or so, but now there's a new, shall we say, more sensitive addition. The company has announced Vulcan, its first robot with a sense of touch. It's one in a series of new robots introduced today at Amazon's Delivering the Future event in Germany. Vulcan uses force feedback sensors to monitor how much it's pushing or holding on to an object and, ideally, not damage it. "In the past, when industrial robots have unexpected contact, they either emergency stop or smash through that contact. They often don't even know they have hit something because they cannot sense it."


Amazon Has Made a Robot With a Sense of Touch

WIRED

Amazon has developed a new warehouse robot that uses touch to rummage around shelves to find the right product to ship to customers. The robot, called Vulcan, is a meaningful step towards making robots less sausage-fingered compared to human beings. Honing robots' tactile abilities further may allow them to take on more fulfillment and manufacturing work in the years ahead. Aaron Parness, Amazon's director of robotics AI who led the development of Vulcan, explains that touch sensing helps the robot push items around on a shelf and identify what it's after. "When you're trying to stow [or pick] items in one of these pods, you can't really do that task without making contact with the other items," he says.


WavePulse: Real-time Content Analytics of Radio Livestreams

arXiv.org Artificial Intelligence

Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.


Deep Inverse Design for High-Level Synthesis

arXiv.org Artificial Intelligence

High-level synthesis (HLS) has significantly advanced the automation of digital circuit design, yet pragma tuning still demands considerable expertise and time. Existing solutions for design space exploration (DSE) adopt either heuristic methods, which lack essential information about further optimization potential, or predictive models, which generalize poorly because HLS runs are time-consuming and the design space grows exponentially. To address these challenges, we propose Deep Inverse Design for HLS (DID4HLS), a novel approach that integrates graph neural networks and generative models. DID4HLS iteratively optimizes hardware designs aimed at compute-intensive algorithms by learning conditional distributions of design features from post-HLS data. Evaluated against four state-of-the-art DSE baselines across six benchmarks, our method achieved an average improvement of 42.5% in average distance to reference set (ADRS) over the best-performing baselines, while demonstrating high robustness and efficiency.
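
The abstract reports results in ADRS without defining it. The sketch below shows one commonly used formulation, assumed here rather than taken from the paper: for each point on the reference Pareto front, take the smallest worst-case relative shortfall among the explored designs, then average over the reference points. The function names and the two-objective example (latency, resource usage) are illustrative, and all objectives are assumed to be positive and minimized.

```python
from typing import Sequence, Tuple

# A design point is a tuple of objective values, e.g. (latency, resource usage).
# Assumption: every objective is positive and lower is better.
Point = Tuple[float, ...]


def point_distance(ref: Point, approx: Point) -> float:
    """Largest relative shortfall of `approx` versus `ref`, floored at zero."""
    gaps = [(a - r) / r for r, a in zip(ref, approx)]
    return max([0.0] + gaps)


def adrs(reference: Sequence[Point], approximate: Sequence[Point]) -> float:
    """Average, over reference points, of the distance to the closest explored design."""
    if not reference or not approximate:
        raise ValueError("both point sets must be non-empty")
    total = sum(
        min(point_distance(r, a) for a in approximate) for r in reference
    )
    return total / len(reference)


if __name__ == "__main__":
    reference_front = [(10.0, 100.0), (20.0, 50.0)]
    explored_front = [(10.0, 100.0), (25.0, 50.0)]
    print(adrs(reference_front, explored_front))
```

Under this formulation a lower ADRS means the explored designs lie closer to the reference front, and a value of 0 means every reference point is matched or dominated.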


Uncovering Hidden Intentions: Exploring Prompt Recovery for Deeper Insights into Generated Texts

arXiv.org Artificial Intelligence

Today, the detection of AI-generated content is receiving more and more attention. Our idea is to go beyond detection and try to recover the prompt used to generate a text. This paper, to the best of our knowledge, introduces the first investigation in this particular domain without a closed set of tasks. Our goal is to study whether this approach is promising. We experiment with zero-shot and few-shot in-context learning as well as with LoRA fine-tuning. After that, we evaluate the benefits of using a semi-synthetic dataset. For this first study, we limit ourselves to text generated by a single model. The results show that it is possible to recover the original prompt with a reasonable degree of accuracy.


Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations

arXiv.org Artificial Intelligence

The robustness of recent Large Language Models (LLMs) has become increasingly crucial as their applicability expands across various domains and real-world applications. Retrieval-Augmented Generation (RAG) is a promising solution for addressing the limitations of LLMs, yet existing studies on the robustness of RAG often overlook the interconnected relationships between RAG components or the potential threats prevalent in real-world databases, such as minor textual errors. In this work, we investigate two underexplored aspects when assessing the robustness of RAG: 1) vulnerability to noisy documents through low-level perturbations and 2) a holistic evaluation of RAG robustness. Furthermore, we introduce a novel attack method, the Genetic Attack on RAG (GARAG), which targets these aspects. Specifically, GARAG is designed to reveal vulnerabilities within each component and test the overall system functionality against noisy documents. We validate RAG robustness by applying GARAG to standard QA datasets, incorporating diverse retrievers and LLMs. The experimental results show that GARAG consistently achieves high attack success rates. It also severely degrades the performance of each component and their synergy, highlighting the substantial risk that minor textual inaccuracies pose in disrupting RAG systems in the real world.
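
To make "low-level perturbations" concrete, here is a minimal sketch of character-level typo injection. It is not the GARAG algorithm: the genetic search and the retriever and LLM scoring are omitted, and the operators (swap, delete, insert), the function names, and the `rate` parameter are assumptions chosen for illustration.

```python
import random
import string


def typo(word: str, rng: random.Random) -> str:
    """Apply one character-level typo (swap, delete, or insert) to a word."""
    if len(word) < 2:
        return word  # too short to perturb meaningfully
    i = rng.randrange(len(word) - 1)
    op = rng.choice(("swap", "delete", "insert"))
    if op == "swap":
        # Transpose the characters at positions i and i + 1.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    # Insert a random lowercase letter before position i.
    return word[:i] + rng.choice(string.ascii_lowercase) + word[i:]


def perturb(text: str, rate: float, seed: int = 0) -> str:
    """Inject a typo into each whitespace-separated word with probability `rate`."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return " ".join(
        typo(w, rng) if rng.random() < rate else w for w in text.split()
    )


if __name__ == "__main__":
    document = "Paris is the capital of France"
    print(perturb(document, rate=0.5, seed=42))
```

A search procedure such as a genetic algorithm would then generate many perturbed variants like these and keep the ones that most degrade retrieval or answer quality.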


Comparative Analysis of ImageNet Pre-Trained Deep Learning Models and DINOv2 in Medical Imaging Classification

arXiv.org Artificial Intelligence

Medical image analysis frequently encounters data scarcity challenges. Transfer learning has been effective in addressing this issue while conserving computational resources. The recent advent of foundation models such as DINOv2, which uses the vision transformer architecture, has opened new opportunities in the field and gathered significant interest. However, DINOv2's performance on clinical data still needs to be verified. In this paper, we performed a glioma grading task using three clinical modalities of brain MRI data. We compared the performance of various pre-trained deep learning models, including those based on ImageNet and DINOv2, in a transfer learning context. Our focus was on understanding the impact of the freezing mechanism on performance. We also validated our findings on three other types of public datasets: chest radiography, fundus radiography, and dermoscopy. Our findings indicate that on our clinical dataset, DINOv2's performance was not as strong as that of ImageNet-based pre-trained models, whereas on public datasets, DINOv2 generally outperformed other models, especially when using the freezing mechanism. Similar performance was observed with various sizes of DINOv2 models across different tasks. In summary, DINOv2 is viable for medical image classification tasks, particularly with data resembling natural images. However, its effectiveness may vary with data that differs significantly from natural images, such as MRI. In addition, smaller versions of the model can be adequate for medical tasks, offering resource-saving benefits. Our code is available at https://github.com/GuanghuiFU/medical_DINOv2_eval.